Abstract

Farsi handwritten character recognition is a major topic in pattern recognition, machine learning,
image processing, machine vision and data mining. Handwritten character recognition has many
applications, such as license plate recognition, document annotation by keywords, postal code
recognition, bank check processing and university score entry systems. Handwriting recognition
faces difficulties such as differing writing styles, differing pen pressure, and the use of thick
or thin pens. In general, the handwritten character recognition problem has three major stages:
(1) preprocessing, which performs normalization, noise removal and segmentation; (2) feature
extraction, which replaces the image with a numerical feature vector that describes it well; and
(3) classification, which tries to recognize the handwritten character with high accuracy using
the extracted features. In this research, we use structural features and some numerical character
features to recognize handwritten digits. The recognition rate of this method for handwritten
digits reached 98.18%.
1. Introduction

Handwritten digit recognition is one of the important problems in optical character
recognition and has many applications. One example is reading numeric data entered into forms.
A practical handwritten digit recognition system faces several challenges, the most important
being the need for a high recognition rate. For Persian, the strong similarity among some digits,
together with variation in how they are written, makes it difficult to build a recognition system
accurate enough for practical use. Methods that improve accuracy are therefore needed. In recent
years, various studies have addressed the recognition of Persian and Arabic handwritten digits
and letters. Darvish et al. use a shape matching algorithm to recognize Persian handwritten
digits; each sample point on the shape's contour is described by the distribution of the positions
of the other contour points [1]. Parvin et al. propose another method for improving the
performance of a recognition system; the main idea is to use pairwise (binary) classifiers [2].
Alizadeh et al. propose a method based on a genetic algorithm for building a neural network
ensemble that uses vote-based weighted classifier selection [3]. Shahabi and Rahmati use Gabor
filter banks, which suit the structure of Persian handwritten text as well as the visual system
[4]. In another work, Parvin et al. combine pairwise classifiers to boost an ensemble of
classifiers, which reduces the error rate in feature space [5]. Another study uses moment
features with a Bayes classifier to recognize Persian handwritten digits [6]. Masroori applies
the dynamic time warping algorithm to recognize digits [7].

Our work is based on extracting a set of structural features from handwritten digits. These
features are the presence of an enclosed region within the digit, branch points and end points,
the orientation of semicircles, and the pixel density in various regions. A decision tree is then
built on these features and evaluated.
The rest of this article is organized as follows. Section 2 introduces the dataset used.
Section 3 describes the preprocessing stage, including noise removal and extraction of the digit
skeleton. Section 4 presents the feature extraction method. Section 5 is devoted to the
classification stage. Finally, the results of the proposed model are presented.

2. Hoda dataset features 

The Hoda database, which is well documented by researchers, is used in this research. The
Hoda handwritten digit set is the first sizable Persian handwritten digit dataset, comprising
102352 binary (black and white) samples of handwritten digits. It was built during a Master's
degree project on the recognition of handwritten forms [8]. The data were extracted from about
12000 registration forms of the 2005 M.Sc. entrance examination and the 2004 Associate's degree
examination of the Applied and Science Comprehensive University. The properties of this dataset
are as follows:



Sample resolution: 200 dots per inch
Total number of samples: 102352
Training samples: 6000 samples per class
Test samples: 2000 samples per class
Other samples: 22352



3. Pre-processing Step 

Preprocessing consists of two stages: normalizing the image size, and dividing each digit
into frames so that it is represented by far fewer pixels. Normalization converts every image to
90 rows and 70 columns of pixels. The frame size is 10x10, and the frame is moved over the image
without overlap. If a frame contains more than 40 bright pixels, the corresponding output pixel
is black; otherwise it is white. In this way a 90x70 image is reduced to 9x7 (Figure 2).

Figure 2: Preprocessing and normalization of the handwritten digit three
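
The framing step described in this section can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `downsample`, the list-of-lists image representation with 1 for a bright pixel, and the convention that an output value of 1 marks a black cell are all assumptions made for the example.

```python
def downsample(image, frame=10, threshold=40):
    """Reduce a binary image by replacing each frame x frame block with one cell.

    image: list of rows holding 0/1 values, where 1 is a bright pixel (assumed).
    Returns a grid in which 1 marks a cell whose block held more than
    `threshold` bright pixels (the "black" output pixel of the text).
    """
    rows, cols = len(image), len(image[0])
    grid = []
    for top in range(0, rows, frame):
        grid_row = []
        for left in range(0, cols, frame):
            # Count the bright pixels inside the current frame.
            bright = sum(
                image[r][c]
                for r in range(top, min(top + frame, rows))
                for c in range(left, min(left + frame, cols))
            )
            grid_row.append(1 if bright > threshold else 0)
        grid.append(grid_row)
    return grid


# A blank 90x70 image with 41 bright pixels inside the top-left frame.
image = [[0] * 70 for _ in range(90)]
for k in range(41):
    image[k // 10][k % 10] = 1
small = downsample(image)
print(len(small), len(small[0]), small[0][0])
```

With the default frame size and threshold, a 90x70 input always yields a 9x7 grid, which matches the reduction stated in the text.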
4. Feature Extraction



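We concatenate the indentations (profiles) of each digit as seen from four directions (left,
right, top and bottom), which gives a dataset with 32 features per digit (Figure 3). For a 9x7
image this count is consistent with 9 rows viewed from the left and from the right plus 7 columns
viewed from the top and from the bottom (9 + 9 + 7 + 7 = 32). The feature vector of the digit 3
is shown in Figure 4.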

Figure 3: Image view of number 3 from four directions 



Figure 4: Extracting the feature vector from the image of digit 3
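
One way to obtain 32 directional features from a 9x7 grid is sketched below. The text does not define the indentations precisely, so the reading used here is an assumption: each feature is the number of empty cells met before the first inked cell when looking along a row or column from one side. The names `first_ink` and `profile_features` are hypothetical.

```python
def first_ink(cells):
    """Number of empty cells before the first inked cell (len(cells) if none)."""
    for depth, value in enumerate(cells):
        if value:
            return depth
    return len(cells)


def profile_features(grid):
    """Concatenate the left, right, top and bottom profiles of a binary grid."""
    rows, cols = len(grid), len(grid[0])
    columns = [[grid[r][c] for r in range(rows)] for c in range(cols)]
    left = [first_ink(row) for row in grid]            # one value per row
    right = [first_ink(row[::-1]) for row in grid]     # one value per row
    top = [first_ink(col) for col in columns]          # one value per column
    bottom = [first_ink(col[::-1]) for col in columns] # one value per column
    return left + right + top + bottom


grid = [[0] * 7 for _ in range(9)]
grid[2][3] = 1  # a single inked cell, used only to exercise the code
features = profile_features(grid)
print(len(features))
```

For a 9x7 grid the vector has 9 + 9 + 7 + 7 = 32 entries, the same length as the feature vector reported in this section.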
5. Back-Propagation Neural Networks

There are many methods for training a network and adjusting its weights so as to reach an
acceptably small error. One of the best known is the error back-propagation algorithm. This
algorithm, proposed by Rumelhart and McClelland in 1986, is applied to feed-forward neural
networks. Feed-forward means that the artificial neurons are arranged in successive layers and
send their outputs (signals) forward. The term "back-propagation" means that the error is fed
backward through the network to adjust the weights, after which the input is again passed forward
along its path to the output. Error back-propagation is a supervised method: the input samples
are labeled, and the expected output for each of them is known in advance. The network's output
is therefore compared with these target outputs in order to compute the network's error. In this
algorithm the weights are initially chosen at random. At each step the network's output is
computed and the weights are corrected according to the difference from the target output, so
that the error is eventually minimized [9].



Figure 5: Architecture of the back-propagation network (X is the input layer, Z the hidden
layer and Y the output layer; w and v are the network weights)
In what follows, the training algorithm of the back-propagation neural network (Figure 6)
and the algorithm for recognizing a test pattern with this type of network (Figure 7) are
presented [10].



Figure 6: Training algorithm of the back-propagation network



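Step 0. Set $\sigma$, $\alpha$
        Initialize weights W and V with small random values
        n: input units
        p: hidden units
        m: output units
Step 1. While the MSE has not reached a small value and the maximum number of
        iterations has not been reached, do Steps 2-10
Step 2. Set X as the input neurons
Step 3. Find Zin, Z
        For j = 1..p:
            $Zin_j = \sum_{i=0}^{n} X_i V_{ij}$
            $Z_j = f(Zin_j)$
        $Z_0 = 1$
Step 4. Find Yin, Y
        For k = 1..m:
            $Yin_k = \sum_{j=0}^{p} Z_j W_{jk}$
            $Y_k = f(Yin_k)$
Step 5. Find $\delta K_k$ for k = 1..m:
            $\delta K_k = -[t_k - Y_k] \cdot \sigma \cdot Y_k (1 - Y_k)$
Step 6. Find $\delta J_j$ for j = 1..p:
            $\delta J_j = \left( \sum_{k=1}^{m} \delta K_k W_{jk} \right) \cdot \sigma \cdot Z_j (1 - Z_j)$
Step 7. Find $\Delta V_{ij}$ for i = 0..n, j = 1..p:
            $\Delta V_{ij} = -\alpha \cdot \delta J_j \cdot X_i$
Step 8. Find $\Delta W_{jk}$ for j = 0..p, k = 1..m:
            $\Delta W_{jk} = -\alpha \cdot \delta K_k \cdot Z_j$
Step 9. Find $V_{ij}$ for i = 0..n, j = 1..p:
            $V_{ij} = V_{ij} + \Delta V_{ij}$
Step 10. Find $W_{jk}$ for j = 0..p, k = 1..m:
            $W_{jk} = W_{jk} + \Delta W_{jk}$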
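
The training steps of Figure 6 can be sketched in plain Python as below. This is an illustration under stated assumptions, not the authors' code. The sigmoid is assumed to have slope $\sigma$, i.e. f(s) = 1/(1 + e^{-\sigma s}), so that its derivative is $\sigma f (1 - f)$ and matches the $\sigma \cdot Y_k (1 - Y_k)$ factor in the delta terms. A bias input X_0 = 1 is assumed alongside Z_0 = 1. The initial weight range, the fixed epoch count used as stopping rule, the toy OR data and all function names are hypothetical. Each update subtracts $\alpha$ times delta times input, so the weights move against the error gradient.

```python
import math
import random


def sigmoid(s, sigma):
    # Assumed form: logistic function with slope sigma.
    return 1.0 / (1.0 + math.exp(-sigma * s))


def forward(x, V, W, sigma):
    """Steps 3-4: returns inputs and hidden outputs (each with a leading
    bias entry of 1) and the network outputs."""
    xb = [1.0] + list(x)
    z = [1.0] + [sigmoid(sum(xb[i] * V[i][j] for i in range(len(xb))), sigma)
                 for j in range(len(V[0]))]
    y = [sigmoid(sum(z[j] * W[j][k] for j in range(len(z))), sigma)
         for k in range(len(W[0]))]
    return xb, z, y


def mse(samples, V, W, sigma):
    """Mean over samples of the summed squared output error."""
    total = 0.0
    for x, t in samples:
        y = forward(x, V, W, sigma)[2]
        total += sum((tk - yk) ** 2 for tk, yk in zip(t, y))
    return total / len(samples)


def train(samples, n, p, m, sigma=2.0, alpha=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    # Step 0: small random weights; the range [-0.5, 0.5] is an assumption.
    V = [[rng.uniform(-0.5, 0.5) for _ in range(p)] for _ in range(n + 1)]
    W = [[rng.uniform(-0.5, 0.5) for _ in range(m)] for _ in range(p + 1)]
    for _ in range(epochs):                      # Step 1 (fixed epoch count)
        for x, t in samples:                     # Step 2
            xb, z, y = forward(x, V, W, sigma)   # Steps 3-4
            # Step 5: output deltas.
            dK = [-(t[k] - y[k]) * sigma * y[k] * (1 - y[k]) for k in range(m)]
            # Step 6: hidden deltas for units 1..p (stored at index j - 1).
            dJ = [sum(dK[k] * W[j][k] for k in range(m)) * sigma * z[j] * (1 - z[j])
                  for j in range(1, p + 1)]
            # Steps 7 and 9: move V against the gradient.
            for i in range(n + 1):
                for j in range(p):
                    V[i][j] -= alpha * dJ[j] * xb[i]
            # Steps 8 and 10: move W against the gradient.
            for j in range(p + 1):
                for k in range(m):
                    W[j][k] -= alpha * dK[k] * z[j]
    return V, W


# Toy data (logical OR), used only to exercise the code.
data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [1])]
V0, W0 = train(data, n=2, p=3, m=1, epochs=0)
V1, W1 = train(data, n=2, p=3, m=1, epochs=2000)
print(mse(data, V0, W0, 2.0), mse(data, V1, W1, 2.0))
```

For the network described in this paper the call would use n=32, p=24 and m=10; the pure-Python loops are written for readability and would be slow at that size.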
Figure 7: Algorithm for testing (recognizing) a sample with the neural network



Step 0. Set $\sigma$
        Use the weights W and V obtained from training (Figure 6)
        n: input units
        p: hidden units
        m: output units
Step 1. Set X as the unknown pattern
Step 2. Find Zin, Z
        For j = 1..p:
            $Zin_j = \sum_{i=0}^{n} X_i V_{ij}$
            $Z_j = f(Zin_j)$
        $Z_0 = 1$
Step 3. Find Yin, Y
        For k = 1..m:
            $Yin_k = \sum_{j=0}^{p} Z_j W_{jk}$
            $Y_k = f(Yin_k)$
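
The recognition pass of Figure 7 amounts to a single forward computation. The sketch below assumes the same slope-$\sigma$ sigmoid and bias entries X_0 = Z_0 = 1 as the training steps. Taking the index of the largest output as the recognized class is an assumption, since the figure stops at computing Y; the function name `recognize` is hypothetical.

```python
import math


def recognize(x, V, W, sigma=2.0):
    """Forward pass; returns (index of the largest output, all outputs)."""
    def f(s):
        return 1.0 / (1.0 + math.exp(-sigma * s))

    xb = [1.0] + list(x)                                  # X with bias X_0 = 1
    z = [1.0] + [f(sum(xb[i] * V[i][j] for i in range(len(xb))))
                 for j in range(len(V[0]))]               # Z with bias Z_0 = 1
    y = [f(sum(z[j] * W[j][k] for j in range(len(z))))
         for k in range(len(W[0]))]
    label = max(range(len(y)), key=y.__getitem__)         # assumed decision rule
    return label, y


# Hand-set weights (hypothetical) for one input, one hidden unit, two outputs.
V = [[0.0], [0.0]]
W = [[-5.0, 5.0], [0.0, 0.0]]
label, outputs = recognize([1.0], V, W)
print(label, outputs)
```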



6. Implementation Results



Before the simulation, 4500 samples were extracted from the Hoda dataset, and we divided the
input data into two groups.
1. Training data: these were taken from the labeled data. Of the total, 3000 samples (about 67%)
were selected at random and used as training data. After the network was trained on these data,
the weights reached their final values and the network error fell to its minimum of 3.2237.
2. Test data: the remaining 1500 samples (about 33%) were used to test the trained network
(Table 1).



Table 1: Performance results and number of correctly recognized test samples in each class

Class    Correct    Incorrect    Recognition rate
                    13           91%
                    4            97%
Sum      1357       143          91.53%



Table 2: Values of the parameters used

Variable                          Initial value
$\sigma$                          2
$\alpha$                          0.5
n                                 32
p                                 24
m                                 10
Number of algorithm executions    100000 times
Random weight values (v, w)       between and 1
Number of layers                  3
Sigmoid function used             $f(x) = 1/(1 + e^{-x})$







7. Conclusion

The proposed algorithms were implemented and evaluated on the Hoda dataset. The number of
neurons in the hidden layer (24) was determined by trial and error. The average classification
accuracy was 100% on the training data and about 89.18% on the test data.



A Monthly Double-Blind Peer Reviewed Refereed Open Access International e-Journal - Included in the International Serial Directories
Indexed & Listed at: Ulrich's Periodicals Directory ©, U.S.A., as well as in Cabell's Directories of Publishing Opportunities, U.S.A.
International Journal of Management, IT and Engineering
http://www.ijmra.us
July 2013
ISSN: 2249-0558



References 

[1] M. Nahvi, K. Kiani and R. Ebrahimpour, "Improvement of Gradient Trait Derivation Method
Based on Discrete Cosine Transform In Order to Recognition of Persian Handwritten Digits",
18th International Conference on Computer, Iran, 2010.

[2] A. Darvish, E. Kabir and H. Khosravi, "Recognition of Handwritten Farsi Digits by Shape
Matching", 11th CSI Computer Conference, Iran, 2005.

[3] H. Khosravi, "Recognition of Persian Handwritten Digits and Letters from Registration Form
of University's Entrance Examination", Master's thesis in electronics, Tarbiat Modares
University, 2005.

[4] A. R. Sabri and M. A. Sameh, "Recognition of Off-Line Handwritten Arabic (Indian) Numerals
Using Multi-Scale Features and Support Vector Machines Vs. Hidden Markov Models", The
Arabian Journal for Science and Engineering, Volume 34, Number 2B, pp. 429-444, 2009.

[5] A. R. Darvish, E. Kabir and H. Khosravi, "Application of Shape Adaptation in Recognition of
Handwritten Farsi Digits", Journal of Technical & Engineering Modarres, Vol. 22, pp. 37-47,
2006.

[6] H. Parvin, H. Alizadeh, M. Moshki, B. Minaei-Bidgoli and N. Mozayani, "Divide & Conquer
Classification and Optimization by Genetic Algorithm", Third International Conference on
Convergence and Hybrid Information Technology, Iran, 2008.

[7] H. Alizadeh, H. Parvin and B. Minaei-Bidgoli, "A New Approach to Improve the Vote-Based
Classifier Selection", Fourth International Conference on Networked Computing and Advanced
Information Management, Iran, 2008.

[8] F. Shahabi and M. Rahmati, "A New Method for Writer Identification of Handwritten Farsi
Documents", 10th International Conference on Document Analysis and Recognition, Iran, 2009.

[9] O. Sayyadi, "Preliminary Produce about Artificial Neural Networks", Master's thesis in
electronics, Sharif University of Technology, 2008.

[10] A. Mehr-joui, "Artificial Neural Networks", 1st edition, Islam-Shahr, Islamic Azad
University Islam-Shahr Branch, 2007.


